CONSTRAINT SATISFACTION BY SURVEY PROPAGATION

Survey Propagation (SP) is an algorithm designed for solving typical instances of random constraint
satisfiability problems. It has been successfully tested on random 3-satisfiability (3-SAT) and
random $\mathcal{G}(n, c/n)$ graph 3-coloring (3-COL), in the hard region of the parameter space, relatively close
to the SAT/UNSAT phase transition. Here we provide a generic formalism which applies to a wide
class of discrete Constraint Satisfaction Problems.

I. INTRODUCTION



In this paper we suggest a new theoretical framework for the so-called "Survey Propagation" (SP) equations that are
at the root of both the analysis and the algorithms used in previous work to solve the random 3-SAT and q-COLORING
problems. In the more general context of constraint satisfaction problems we propose a slightly different way of deriving
the equations, which we hope can shed some light on the potential of the algorithms and which makes clear the
differences with other well-known iterative probabilistic algorithms. This line of approach, also discussed in the
literature for the satisfiability problem, is developed here systematically through the addition of an extra state for
the variables, which allows one to take care of the clustered structure of the space of solutions. Within a cluster a
variable can be either "frozen" to some value, that is, it takes the same value in all solutions (satisfying assignments)
within the cluster, or "unfrozen", that is, it fluctuates from solution to solution within the cluster. As
we shall discuss, the purpose of the SP equations is to properly describe the cluster-to-cluster fluctuations by associating
to unfrozen variables an extra state, to be added to those belonging to the original definition of the problem. The
overall algorithmic strategy is iterative and decomposes into two elementary steps: first, the marginal probabilities
of frozen variables are evaluated by the SP message-passing procedure; second (the so-called decimation step), some
variables are fixed using this information and the problem is simplified. While the first step is unavoidable if one is
interested in marginal probabilities, the second step is just dictated by simplicity, and we expect that there could exist
other ways of efficiently using the information provided by the marginals.

Throughout the paper, we also give a detailed comparison with a similar message-passing procedure, Belief Propagation,
which does not make assumptions about the structure of the solution space.

The structure of the paper is as follows. In Sec. II we provide the general formalism, namely the definitions of
Constraint Satisfaction Problems, Factor Graphs and Cavities, with concrete reference to the cases of Coloring and
Satisfiability. In Sec. III we introduce the warnings and the local fields whose histograms will provide the so-called
Belief Propagation equations. Finally, in Sec. IV clusters are introduced and the SP equations are derived. Explicit
equations are given for both 3-COL and 3-SAT and the decimation procedure is discussed.



II. GENERALITIES 
A. Constraint satisfaction problems 



We consider a constraint satisfaction problem (CSP) which is defined on a set of discrete variables $x = (x_i)_{i \in I}$ with
$I = \{1, \dots, n\}$. Each variable $x_i$ can be in $q$ possible states (the generalization to the case where the number of states
is $i$-dependent is straightforward), so $x \in X = \{1, \dots, q\}^n$. The vector $x$ is called a configuration. These variables
are subject to a set of constraints $\{C_a\}_{a \in A}$. Each $C_a$ depends on $x$ only through a subset $(x_i)_{i \in I(a)}$ of variables. It is
defined as a mapping $C_a : \{1, \dots, q\}^{|I(a)|} \to \{0, 1\}$, where the value $C_a = 0$ corresponds to a satisfied constraint,
and $C_a = 1$ to an unsatisfied constraint. It is useful to introduce, for every $i \in I$, the subset $A(i) \subset A$ of indices of
all constraints involving $x_i$. The index sets $I$ and $A$ are chosen disjoint, so that their elements uniquely determine a
single variable or constraint.
We define the cost function

$C[x] = \sum_{a \in A} C_a[(x_i)_{i \in I(a)}]$   (1)

which counts the number of unsatisfied constraints. Our goal is to simultaneously satisfy all constraints, i.e. to find
a configuration $s \in X$ with $C[s] = 0$. We thus introduce the subset $S_C \subset X$ of solutions to our CSP instance as

$S_C = \{ s \mid s \in X,\ C[s] = 0 \}$ .   (2)

The algorithm aims at finding one solution $s \in S_C$. We restrict ourselves a priori to instances $C[x]$ which possess a
non-empty solution set $S_C$.
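
As an illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the cost function and enumerates the solution set by brute force. The representation of a constraint as a pair `(scope, fn)`, the 0-based variable indices and the helper names `cost`, `solutions` and `edge_cost` are assumptions made for this example only; the enumeration is exponential in n and is meant for toy instances.

```python
from itertools import product

def cost(constraints, x):
    """Eq. (1): number of unsatisfied constraints in configuration x.

    constraints is a list of (scope, fn) pairs: scope is a tuple of variable
    indices (0-based here), fn maps the tuple of their values to 0 (satisfied)
    or 1 (unsatisfied).
    """
    return sum(fn(tuple(x[i] for i in scope)) for scope, fn in constraints)

def solutions(constraints, n, q):
    """Eq. (2): the solution set S_C, found by exhaustive enumeration."""
    return [x for x in product(range(1, q + 1), repeat=n)
            if cost(constraints, x) == 0]

def edge_cost(values):
    """Coloring constraint: unsatisfied (1) when both endpoints share a color."""
    return int(values[0] == values[1])

# Toy instance: 3-coloring of a triangle.
triangle = [((0, 1), edge_cost), ((1, 2), edge_cost), ((0, 2), edge_cost)]
n_violated = cost(triangle, (1, 1, 2))
n_solutions = len(solutions(triangle, 3, 3))
```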

B. Factor graph 

We use the factor-graph representation for a CSP:

Definition II.1 For any instance of the CSP problem, we define its factor graph as a bipartite undirected graph
$G = (V, E)$, having two types of nodes:

• variable nodes $i \in I$ and

• function nodes $a \in A$.

Edges connect only different node types; the edge $(i, a)$ belongs to the graph if and only if the constraint $C_a$
involves the variable $x_i$, i.e. if $a \in A(i)$ or equivalently $i \in I(a)$. More formally, we define $V = A \cup I$ and
$E = \{(i, a) \mid i \in I,\ a \in A(i)\} = \{(i, a) \mid a \in A,\ i \in I(a)\}$.

In figures, we always represent variable nodes by circles, whereas function nodes are drawn as squares. This notation 
will help to distinguish between the different origins of the two node types. 
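
A factor graph in the sense of Definition II.1 can be stored as two adjacency maps. The sketch below is one possible layout, not a prescribed one: the function name `factor_graph` and the dictionaries `I_of` (for I(a)) and `A_of` (for A(i)) are hypothetical names chosen for this example, and variables are indexed from 0.

```python
def factor_graph(scopes, n):
    """Build the bipartite adjacency of a factor graph.

    scopes[a] lists the variables entering constraint a, i.e. I(a).
    Returns I_of (a -> I(a)), A_of (i -> A(i)) and the edge list E.
    """
    I_of = {a: tuple(scope) for a, scope in enumerate(scopes)}
    A_of = {i: [a for a in I_of if i in I_of[a]] for i in range(n)}
    edges = [(i, a) for a in I_of for i in I_of[a]]
    return I_of, A_of, edges

# Two constraints on four variables, sharing variables 1 and 2.
I_of, A_of, edges = factor_graph([(0, 1, 2), (1, 2, 3)], n=4)
```

Both descriptions of the edge set in Definition II.1 coincide by construction: `(i, a)` is in `edges` exactly when `a` is in `A_of[i]`.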

C. Cavities 

Given a CSP and its factor graph, we will use the cavity graphs obtained by removing a variable:

Definition II.2 Given a factor graph $G$ and one variable node $i \in I$, we define the cavity graph $G^{(i)}$ by deleting
from $G$ all function nodes $a \in A(i)$ which are adjacent to $i$, and the edges incident to these function nodes.

The cavity graph $G^{(i)}$ defines a new CSP, where the cost function is

$C^{(i)} = C - \sum_{a \in A(i)} C_a = \sum_{b \notin A(i)} C_b$ .   (3)

Note that in this new problem the variable $x_i$ is isolated: it can take any value without violating a constraint. The
solution set $S_C^{(i)}$ for the cavity problem $C^{(i)}$ contains the original one, since some constraints have been removed.
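
The effect of the cavity construction on the solution set can be checked on a toy instance. In this hedged sketch the names `cavity`, `solutions` and `edge_cost` are illustrative, constraints are again `(scope, fn)` pairs, and the instance is a 3-coloring of a path with three vertices.

```python
from itertools import product

def solutions(constraints, n, q):
    """Brute-force solution set of a small CSP given as (scope, fn) pairs."""
    return [x for x in product(range(1, q + 1), repeat=n)
            if all(fn(tuple(x[i] for i in scope)) == 0 for scope, fn in constraints)]

def cavity(constraints, i):
    """Definition II.2: drop every constraint whose scope contains variable i."""
    return [(scope, fn) for scope, fn in constraints if i not in scope]

def edge_cost(values):
    return int(values[0] == values[1])

# Path 0 - 1 - 2, to be colored with q = 3 colors.
path = [((0, 1), edge_cost), ((1, 2), edge_cost)]
full = set(solutions(path, 3, 3))
cav = set(solutions(cavity(path, 0), 3, 3))
```

Here the cavity problem for variable 0 keeps only the edge between 1 and 2, so variable 0 is free and the original solutions form a subset of the cavity solutions.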

D. Two examples: Satisfiability and Coloring 

Although the algorithm can in principle be written for arbitrary CSP, we shall present two specific examples, 
satisfiability and coloring. 

In the satisfiability problem a constraint $C_a$ is a clause, which is unsatisfied by only one assignment of the variables
$(x_i)_{i \in I(a)}$. In the random 3-SAT problem each clause involves three variables ($|I(a)| = 3$), the indices of which are
chosen randomly with uniform distribution in $I$. For a given $a$ and $I(a)$, there are eight different types of constraints
$C_a$, corresponding to the combinations of possible negations of literals in one clause, see Fig. 1. In random 3-SAT
the type of clause is chosen with uniform distribution among these eight types.

Figure 1: The factor graph of a 3-SAT problem corresponding to a simple two-clause formula on the variables
$(x_1, x_2, x_3)$ and $(x_2, x_3, x_4)$. Variables are represented as circles, clauses (i.e. function nodes) as squares.
A triangle-shaped mark indicates that the corresponding literal is negated.




Figure 2: The original graph (left) and its factor graph (right) corresponding to a coloring problem 



In the q-coloring problem one is given an original undirected graph. The problem is to color the vertices, using
$q$ colors, so that two vertices connected by an edge have different colors. There is one constraint associated with
each edge of the original graph, and the factor graph appears as a decoration of the original graph (see Fig. 2), where
function nodes have been added on each original edge. There is only one type of function node. In the random q-COL
problem, the original graph is a random $\mathcal{G}(n, c/n)$ graph.

We will be particularly interested in the behavior of the algorithm for large $n$. Note that both K-SAT and q-COL
are problems where $|A(i)|$ has a Poisson limit distribution with finite mean when $n \to \infty$, i.e. $|A(i)|$ is typically much
smaller than $n$. Moreover, the structure of the factor graph is locally tree-like. This will guide us in the definition of
the algorithm below, and it is presumably an important ingredient for the algorithm to work.
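
The two random ensembles can be sampled in a few lines. This is only a sketch under stated assumptions: the function names `random_3sat` and `random_graph`, the clause density parameter `alpha` (clauses per variable) and the choice of three distinct variables per clause are conventions adopted here for illustration and are not fixed by the text above.

```python
import random

def random_3sat(n, alpha, rng):
    """Random 3-SAT: round(alpha * n) clauses, each on 3 distinct variables
    (an assumed convention) with independent uniformly random negations."""
    clauses = []
    for _ in range(round(alpha * n)):
        variables = rng.sample(range(n), 3)
        negated = [rng.random() < 0.5 for _ in variables]
        clauses.append(list(zip(variables, negated)))
    return clauses

def random_graph(n, c, rng):
    """Random graph G(n, c/n): each pair i < j is an edge with probability c/n."""
    p = c / n
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

rng = random.Random(0)
clauses = random_3sat(50, 4.0, rng)
graph_edges = random_graph(50, 4.0, rng)
# Every clause contributes to three variable degrees, so the mean |A(i)| is 3 * alpha.
mean_degree = sum(len(clause) for clause in clauses) / 50
```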

III. BELIEF PROPAGATION 

A. Warnings and fields 

Given a CSP and a configuration $x \in X$, we define the following three quantities associated with $x$.

Definition III.1 For a given edge $a - i$ of the factor graph, with $i \in I$ and $a \in A(i)$, we define a warning as the
q-component vector $u_{a \to i}(x) \in \{0, 1\}^q$ with components:

$u^p_{a \to i}(x) = C_a[(x_j)_{j \in I(a)} \mid x_i \leftarrow p]$ ,   $p = 1, \dots, q$.

So $u^p_{a \to i}(x)$ is the value of constraint $C_a$ in the configuration obtained from $x$ by substituting $p$ in the place of $x_i$.

Definition III.2 For a given edge $a - i$ of the factor graph, with $i \in I$ and $a \in A(i)$, we define a cavity field as the
q-component vector $h_{i \to a}(x) \in \{0, 1\}^q$ with components:

$h^p_{i \to a}(x) = \max_{b \in A(i) \setminus a} u^p_{b \to i}(x)$ .

Definition III.3 For a given node $i \in I$, we define the local field as the q-dimensional vector $h_i(x) \in \{0, 1\}^q$ with
components:

$h^p_i(x) = \max_{a \in A(i)} u^p_{a \to i}(x)$ .

The warning $u_{a \to i}(x)$ is understood as a message sent from constraint $a$ to variable $i$ saying: $x_i$ cannot be in any
of the states $p$ where $u^p_{a \to i}(x) = 1$ without violating constraint $a$. Note that we do not need the value of $x_i$ for
computing $u_{a \to i}$. In fact, the warning depends explicitly only on $(x_j)_{j \in I(a) \setminus i}$.

The local field on variable $i$ summarizes all warnings sent to $i$ from the constraints, i.e. $h^p_i(x) = 1$ means that,
given the values of all other variables $(x_j)$, $j \neq i$, the variable $x_i$ should not be assigned the value $p$, because at least
one neighboring constraint would be violated.

The cavity field $h_{i \to a}(x)$ summarizes all warnings sent to $i$ from the constraints different from $a$.
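
Definitions III.1 to III.3 translate directly into code. The sketch below assumes constraints stored as a list of `(scope, fn)` pairs indexed by their position `a`, 0-based variables and states 1..q; the function names `warning` and `field` are hypothetical.

```python
def warning(constraints, a, i, x, q):
    """Definition III.1: u_{a->i}(x), the value of C_a with x_i replaced by p."""
    scope, fn = constraints[a]
    u = []
    for p in range(1, q + 1):
        y = list(x)
        y[i] = p
        u.append(fn(tuple(y[j] for j in scope)))
    return tuple(u)

def field(constraints, i, x, q, exclude=None):
    """Definition III.3 (local field h_i); passing exclude=a gives the
    cavity field h_{i->a} of Definition III.2."""
    h = [0] * q
    for b, (scope, _) in enumerate(constraints):
        if i in scope and b != exclude:
            u = warning(constraints, b, i, x, q)
            h = [max(hp, up) for hp, up in zip(h, u)]
    return tuple(h)

def edge_cost(values):
    return int(values[0] == values[1])

# Path 0 - 1 - 2 with colors x = (1, 2, 3); constraint 0 is edge (0, 1), constraint 1 is edge (1, 2).
path = [((0, 1), edge_cost), ((1, 2), edge_cost)]
x = (1, 2, 3)
u_01 = warning(path, 0, 1, x, 3)          # edge (0,1) tells variable 1: color 1 is taken
h_1 = field(path, 1, x, 3)                # both neighbors forbid their own color
h_1_cav = field(path, 1, x, 3, exclude=0) # only the warning from edge (1,2) remains
```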



B. Histograms 

The elementary messages above are defined for an arbitrary configuration $x$. We are eventually interested in
knowing, for each variable $i$, the histogram of local fields over the configurations which are solutions of the CSP:

$H_i(h) = \frac{1}{|S_C|} \sum_{s \in S_C} \delta_{h, h_i(s)}$ ,   (4)

where the (q-dimensional) Kronecker delta is simply denoted by $\delta$. This histogram can also be interpreted as the
probability distribution $H_i(h) = \mathrm{Prob}(h_i(s) = h \mid s \in S_C)$ of local fields for randomly chosen solutions.

Local-field histograms contain useful information about the set $S_C$ of solutions, which can be exploited algorithmically
in order to recursively construct one solution. If, e.g., one of the field components is non-zero for all solutions
$s \in S_C$, this particular state is forbidden to this variable. If all but one of the components are non-zero, the variable is
"frozen" to one specific value in all solutions, i.e. it belongs to the so-called backbone, and it can be assigned right
away.
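
For very small instances the histogram of Eq. (4) can be computed exactly by enumeration, which makes the notion of a frozen variable concrete. In this sketch the names `local_field` and `field_histogram` are illustrative, constraints are `(scope, fn)` pairs, and the unary "pin" constraints of the second example are an assumption introduced only to force a backbone variable.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def local_field(constraints, i, x, q):
    """Local field h_i(x): component p is 1 if some constraint forbids x_i = p."""
    h = [0] * q
    for scope, fn in constraints:
        if i in scope:
            for p in range(1, q + 1):
                y = list(x)
                y[i] = p
                h[p - 1] = max(h[p - 1], fn(tuple(y[j] for j in scope)))
    return tuple(h)

def field_histogram(constraints, i, n, q):
    """Eq. (4): exact histogram of h_i over all solutions (brute force)."""
    sols = [x for x in product(range(1, q + 1), repeat=n)
            if all(fn(tuple(x[j] for j in scope)) == 0 for scope, fn in constraints)]
    counts = Counter(local_field(constraints, i, s, q) for s in sols)
    return {h: Fraction(c, len(sols)) for h, c in counts.items()}

def edge_cost(values):
    return int(values[0] == values[1])

triangle = [((0, 1), edge_cost), ((1, 2), edge_cost), ((0, 2), edge_cost)]
H_free = field_histogram(triangle, 0, 3, 3)

# Pin variables 1 and 2 to colors 2 and 3 with unary constraints: variable 0 becomes frozen.
pinned = triangle + [((1,), lambda v: int(v[0] != 2)), ((2,), lambda v: int(v[0] != 3))]
H_frozen = field_histogram(pinned, 0, 3, 3)
```

In the plain triangle each single-zero field appears with weight 1/3, so variable 0 is not frozen; in the pinned instance the field (0, 1, 1) has weight 1 and variable 0 can be assigned color 1 right away.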

Computing $H_i(h)$ is a difficult task, but one may compute it approximately using a message-passing procedure. We
first try to find a recursion relation for the related histograms of the warnings $u_{a \to i}(s)$ over all solutions $s \in S_C$.
Considering Fig. 3 as an example, we note that the histogram of $u_{a \to i}(s)$ depends on the joint histogram of all the
warnings $u_{b \to j}(s)$ sent to all variables $j \in \{j_1, j_2, j_3\}$ "above" function node $a$ (we call them the incoming warnings).
The obvious problem is that this joint distribution is not known. If the $u_{b \to j}(s)$ were independent variables, we would
be able to factorize the joint histogram into the product of all individual histograms of warnings $u_{b \to j}(s)$, and then
to obtain a recursion. But in general there is no reason for them to be independent. Moreover, they cannot even
be approximately independent, as there are very short paths joining the variables "above" the variable nodes $j$ (the small
unnamed ones in the figure), and these variables in turn define the $u_{b \to j}(s)$ messages. This is where the
cavity graph is useful.

For each edge $a - i$ of the factor graph, we define the belief $B_{a \to i}(u)$ as the histogram of the warning $u_{a \to i}$ over the
configurations $s \in S_C^{(i)}$ which are solutions of the cavity graph problem:

$B_{a \to i}(u) = \frac{1}{|S_C^{(i)}|} \sum_{s \in S_C^{(i)}} \delta_{u, u_{a \to i}(s)}$
$\phantom{B_{a \to i}(u)} = \mathrm{Prob}(u_{a \to i}(s) = u \mid s \in S_C^{(i)})$ ,   (5)

where the second line refers again to the probabilistic interpretation: $B_{a \to i}(u)$ describes the probability of finding a
warning $u$ if a solution of the cavity graph is randomly selected.




C. Belief propagation equations 



Look again at Fig. 3. If the factor graph $G$ is a tree, the vertices above $j_1$, $j_2$ and $j_3$ become disconnected if the function
nodes $b_i$ are removed, and the various messages $u_{b \to j}$ are uncorrelated. In this case, we can thus determine the belief
$B_{a \to i}$ as a function of all the incoming beliefs (i.e. the histograms of the incoming warnings, $\{B_{b \to j}\}$ with $j \in I(a) \setminus i$
and $b \in A(j) \setminus a$), and so on recursively for the full factor graph. Standard belief propagation uses this same recursion
also in more general factor graphs with loops, as a means to compute approximately the local-field histograms and
the beliefs.

In order to write the corresponding 'belief propagation' equations explicitly, we use notations similar to those of
Fig. 3. Given the edge $a - i$ connecting the function node $a$ to the variable $i$, we denote by $J$ the set of indices of
the variable nodes "above" the function node $a$, i.e. $J = I(a) \setminus i$ (in the figure $J = \{j_1, j_2, j_3\}$). For each $j \in J$, we
denote by $D_j = A(j) \setminus a$ the set of function nodes "above" the variable $j$ (in the figure, $D_{j_1} = \{b_1, b_2\}$), and by $D$ the
union of these sets: $D = \bigcup_{j \in J} D_j$. The "incoming messages", which can be warnings or beliefs, are all the messages
propagated on the edges $b \to j$ with $j \in J$ and, for each such $j$, $b \in D_j$.

Let us first consider a set of incoming warnings $\{u_{b \to j}\}$. This warning set may or may not be "extensible" to
a configuration $(s_j)_{j \in J}$ satisfying all constraints $(C_b)_{b \in D}$. One can easily go through a bookkeeping procedure to
evaluate all configurations $(s_j)_{j \in J}$ compatible with the warning set. First compute the cavity fields (Definition III.2)
componentwise: $h^p_{j \to a} = \max_{b \in D_j} u^p_{b \to j}$. For each $j \in J$, the allowed values of $s_j$ are those such that $h^{s_j}_{j \to a} = 0$.
We denote by $T(\{h_{j \to a}\}) \subset \{1, \dots, q\}^{|J|}$ the set of allowed configurations of the $s_j$ variables:

$T(\{h_{j \to a}\}) = \{ (s_j)_{j \in I(a) \setminus i} \mid h^{s_j}_{j \to a} = 0,\ \forall j \in I(a) \setminus i \}$   (6)

For each $(s_j)$ in $T(\{h_{j \to a}\})$, one can determine the output warning $u_{a \to i}$ using Definition III.1.

This procedure can be embedded into the probabilistic description of solutions on the cavity graph. For doing so,
we assume that incoming warnings are independent. Following the steps above, one first calculates from the incoming
beliefs the distributions of cavity fields

$H_{j \to a}(h) = \sum_{\{u_{b \to j}\}_{b \in D_j}} \delta_{h, h_{j \to a}} \prod_{b \in D_j} B_{b \to j}(u_{b \to j})$ .   (7)

The new distribution of warnings $u_{a \to i}$ is now given by an average over cavity fields,

$B_{a \to i}(u) = Z^{-1} \sum_{\{h_{j \to a}\}_{j \in J}} \ \sum_{(s_j) \in T(\{h_{j \to a}\})} \delta_{u, u_{a \to i}((s_j))} \prod_{j \in J} H_{j \to a}(h_{j \to a})$ .   (8)

The prefactor $Z^{-1}$ is a normalization constant. Note that each cavity-field configuration $\{h_{j \to a}\}$ contributes
$|T(\{h_{j \to a}\})|$ terms. As a byproduct, contradictory messages automatically do not contribute anything to Eq. (8).

Figure 4: An example of the coloring problem. This part of the factor graph is the one necessary to compute the messages
(warning and belief) passed from the function node a to the variable node i.

The BP equations (7,8) are equivalent to the so-called sum-product (or belief network, or Bayesian network)
equations. One can try to solve them by iteration, starting from some randomly chosen beliefs, and updating
$B_{a \to i}$ sequentially on randomly chosen $a - i$ edges. In some cases, the iteration converges, independently of the scheme
of updating, to a unique solution. When the belief propagation equations converge, one can use the obtained beliefs
in order to estimate the histogram of local fields, using:

$H_i(h) \simeq \sum_{\{u_{b \to i}\}_{b \in A(i)}} \delta_{h, h_i} \prod_{b \in A(i)} B_{b \to i}(u_{b \to i})$ ,   (9)

and this histogram can be used for decimation.



D. An example of Belief Propagation: 3-COL 

For the sake of clarity, let us work out BP on a simple example of the 3-COL problem ($q = 3$), see Fig. 4. Since
function nodes are connected to two variable nodes only (constraints are edges in the original graph), there is only
one variable node $j$ above function node $a$. For a given configuration of incoming warnings $\{u_{b \to j}\}$, we can make a
table of allowed states $s_j$, and for each of them, we can compute the outgoing warning $u_{a \to i}(s_j)$. Note that the possible
warnings are (1, 0, 0), (0, 1, 0), (0, 0, 1), since a function node can only forbid one color (which is given by the state of
the other variable connected to the function node).

• Suppose that $u_{b_1 \to j} = (1, 0, 0)$, $u_{b_2 \to j} = (0, 1, 0)$, and $u_{b_3 \to j} = (0, 0, 1)$. Then $h_{j \to a} = (1, 1, 1)$: we find a
contradictory message. No satisfiable configuration exists for $s_j$. According to the procedure given above, this
configuration does not contribute.

• Suppose that the incoming messages are $u_{b_1 \to j} = u_{b_2 \to j} = (1, 0, 0)$, and $u_{b_3 \to j} = (0, 1, 0)$. Then $h_{j \to a} = (1, 1, 0)$, and
the only possible coloring state for $j$ is $s_j = 3$. For this configuration, we thus have only one possible outgoing
warning, $u_{a \to i} = (0, 0, 1)$.

• If $u_{b_1 \to j} = u_{b_2 \to j} = u_{b_3 \to j} = (1, 0, 0)$, then $h_{j \to a} = (1, 0, 0)$, and there are two possible colors for $s_j$, namely
states 2 and 3. For the first one we have $u_{a \to i} = (0, 1, 0)$, and for the second one $u_{a \to i} = (0, 0, 1)$. Both contribute
with equal weight to $B_{a \to i}$.

• All other configurations are simple color permutations of the three cases mentioned above, and are handled
analogously.
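
The three cases listed above can be reproduced mechanically. The following sketch, with the hypothetical helper name `bp_outgoing`, computes the cavity field from a set of incoming warnings, the allowed states of $s_j$, and the list of outgoing warnings that BP would count for the coloring constraint.

```python
def bp_outgoing(incoming, q=3):
    """Given the incoming warnings u_{b->j}, return the cavity field h_{j->a},
    the allowed colors s_j and the corresponding outgoing warnings u_{a->i}."""
    h = tuple(max(u[p] for u in incoming) for p in range(q))
    allowed = [p + 1 for p in range(q) if h[p] == 0]
    # For coloring, a variable j of color s forbids exactly color s to its neighbor i.
    outgoing = [tuple(int(p == s) for p in range(1, q + 1)) for s in allowed]
    return h, allowed, outgoing

case_contradiction = bp_outgoing([(1, 0, 0), (0, 1, 0), (0, 0, 1)])
case_forced = bp_outgoing([(1, 0, 0), (1, 0, 0), (0, 1, 0)])
case_two_outputs = bp_outgoing([(1, 0, 0), (1, 0, 0), (1, 0, 0)])
```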

From Eqs. (7,8), we can easily deduce the equation giving the probability distribution $B_{a \to i}$ in terms of all distributions
$\{B_{b_l \to j};\ l = 1, 2, 3\}$. Parameterizing $B_{a \to i}$ according to the three possible messages as

$B_{a \to i}(u) = \eta^1_{a \to i}\, \delta_{u,(1,0,0)} + \eta^2_{a \to i}\, \delta_{u,(0,1,0)} + \eta^3_{a \to i}\, \delta_{u,(0,0,1)}$ ,   (10)

we find

$\eta^p_{a \to i} = \frac{\prod_{l=1}^{3} (1 - \eta^p_{b_l \to j})}{\sum_{r=1}^{3} \prod_{l=1}^{3} (1 - \eta^r_{b_l \to j})}$ .   (11)

This expression can be easily understood: $\eta^p_{a \to i}$ equals the probability that color $p$ is forbidden for node $i$, which
means that node $j$ has already taken this color, $s_j = p$. Now, node $j$ can take color $p$ if and only if it is not forbidden
by any incoming warning: the numerator in Eq. (11) simply calculates the probability that none of the incoming
messages forbids color $p$, and the denominator guarantees normalization. Note that configurations in which all variables
above $j$ take the same color $r$ are counted twice, namely in the expressions for both $p \neq r$. According to the general
discussion given above, this is correct because we have two new configurations for $s_j$, and two corresponding messages
$u_{a \to i}$ can be sent.

Due to the symmetry between colors, a possible solution would be $\eta^p_{a \to i} = 1/q$ for all edges $(i, a) \in E$ and all colors
$p$. Note, however, that the main intention is to use these equations in a recursive coloring algorithm. This means that
some variable nodes may already have been assigned a color, which explicitly breaks the symmetry. Still, Eq. (11) is
valid.
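
As a consistency check, the closed form of Eq. (11) can be compared with a direct enumeration of Eqs. (7) and (8) for the coloring constraint. This is a sketch with hypothetical function names; each incoming belief is represented as a tuple $(\eta^1, \eta^2, \eta^3)$, and colors are indexed from 0 inside the code.

```python
from itertools import product
from math import prod

def bp_update(etas):
    """Eq. (11): outgoing belief from the incoming beliefs (one tuple per b)."""
    q = len(etas[0])
    weights = [prod(1 - eta[p] for eta in etas) for p in range(q)]
    z = sum(weights)
    return [w / z for w in weights]

def bp_update_bruteforce(etas):
    """Eqs. (7)-(8) by enumeration: sum over all combinations of incoming
    warnings; every allowed color of s_j contributes one term."""
    q = len(etas[0])
    out = [0.0] * q
    for forbidden in product(range(q), repeat=len(etas)):
        weight = prod(eta[c] for eta, c in zip(etas, forbidden))
        for s in range(q):
            if s not in forbidden:   # s_j = s is allowed, so color s is forbidden to i
                out[s] += weight
    z = sum(out)
    return [w / z for w in out]

etas = [(0.2, 0.3, 0.5), (0.6, 0.1, 0.3), (0.25, 0.25, 0.5)]
closed_form = bp_update(etas)
enumerated = bp_update_bruteforce(etas)
```

The color-symmetric point, with every incoming belief equal to (1/3, 1/3, 1/3), is mapped onto itself by `bp_update`.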



IV. SURVEY PROPAGATION 



A. Clustering 



Unfortunately, the Belief Propagation dynamics is known not to converge for the random version of many combinatorial
problems (including again 3-SAT and q-COL) in the region of the parameters near the SAT/UNSAT threshold. Recently,
using tools from statistical physics, it has been possible to reach some understanding of what happens in the solution
space of these problems around this threshold. Well below the threshold, i.e. where the number $|A|/n$ of
constraints per variable is relatively small, a generic problem has exponentially many solutions, which tend to form
one giant cluster: for any two solutions, it is possible to find a connecting path via other solutions that requires short
steps only (each pair of consecutive assignments in the path is close together in Hamming distance).

Close to the critical threshold, however, the solution space breaks up into many smaller clusters. Solutions in 
separate clusters are generally far apart. In addition, the cost function C[x] has exponentially many local minima, 
separated from each other by large cost "barriers" . These local cost minima are exponentially more numerous than 
the solution clusters, and they act as traps for local search algorithms. 

According to the statistical-physics analysis (which considers the infinite-size limit, $n \to \infty$), there exist exponentially
many widely separated clusters of solutions. Within one such cluster of solutions, we may identify two types
of variables: those which are frozen in one single state for all configurations belonging to the cluster, and those
(unfrozen) which fluctuate from solution to solution inside the cluster. Note that the variables which are frozen
within one solution cluster may change their state when we go to another cluster, where they may even be unfrozen.
While in general the above distinction can only provide an approximate description of clusters, it appears from
numerical experiments that in many hard random CSPs, like K-SAT or q-COL, this type of approximation is already
rather accurate.



B. The joker states 



Survey propagation turns out to be able to deal with this clustering phenomenon for large (finite) sizes $n$.
Although the original derivation uses sophisticated statistical physics ideas, one can also develop it more directly
in algorithmic terms. The main idea is that we do not work any more with individual solutions $s \in S_C$, but with
complete clusters of solutions. As already said, some variables may be frozen inside a cluster, so they retain one single
value $s_j \in \{1, \dots, q\}$ in our description. Other variables may take several values in the cluster. For handling these
variables, one can introduce an additional joker state which we denote by a "*". An even finer description, useful
for general CSP, uses varieties of joker states, describing the set of values which are allowed for the variable, so that
$s_j \in \mathcal{V}$, where $\mathcal{V}$ is the ensemble built from all subsets of $\{1, \dots, q\}$. To each cluster one should associate exactly one
generalized value of each variable. One can then generalize the constraint to this enlarged space and work out the
corresponding belief propagation equations. The resulting equations are the survey propagation equations.

We shall not develop this 'derivation' in more detail, since it does not give any rigorous construction; instead we
directly write the equations themselves, in terms of the original variables $x \in \{1, \dots, q\}^n$, and then analyze them.

C. Generalized messages

We first need to define the generalizations of the warnings, the cavity-fields and the local field used in survey 
propagation. In order to lighten the notation and presentation, we shall drop the 'generalized', and use the same 
notation for generalized warning as we used for warnings in the BP section. The reader should remember that in SP 
all the messages are 'generalized' messages. 

For a given CSP, we define the generalized warnings: 

Definition IV.1 For a given edge $a - i$ of the factor graph, with $i \in I$ and $a \in A(i)$, let $S$ be a given set of
possible values for the variables $(x_j)_{j \in I(a) \setminus i}$ which are 'above' $a$. We define the warning as the q-component vector
$u_{a \to i}(S) \in \{0, 1\}^q$ with components:

$u^p_{a \to i}(S) = \min_{x \in S} C_a[(x_j)_{j \in I(a) \setminus i} \mid x_i \leftarrow p]$ ,   $p = 1, \dots, q$.

(The generalized warning was called cavity-bias in [3].) Note that the set of possible warnings is enlarged in SP: for
the example of 3-COL the null message (0, 0, 0) is added to (0, 0, 1), (0, 1, 0) and (1, 0, 0). As we have discussed before,
the non-null messages are sent if the node "above" a function node is assigned a fixed color in the solution cluster.
Correspondingly, the new message is sent if this vertex is not fixed to a single color, i.e. if it is in the joker state.

Based on these warnings, we define local and cavity fields according to Defs. III.2 and III.3, with the argument (a single
configuration) replaced by a set $S$ of configurations:

$h^p_i(S) = \max_{a \in A(i)} u^p_{a \to i}(S)$ ,
$h^p_{j \to a}(S) = \max_{b \in A(j) \setminus a} u^p_{b \to j}(S)$ .   (12)



D. Histograms 

Histograms of warnings and fields are now defined as sums over clusters. The histogram of local fields is given by

$H_i(h) = \frac{1}{n_{cl}} \sum_{\alpha=1}^{n_{cl}} \delta_{h, h_i(S_\alpha)}$ ,   (13)

where the sum runs over the $n_{cl}$ clusters $S_\alpha$ of solutions. The histogram of the generalized warning on an edge
$a - i$ is now called the survey, denoted $Q_{a \to i}(u)$. It is defined in
terms of the clusters of solutions for the cavity graph where $i$ has been taken away. Calling $S^{(i)}_\alpha$ the corresponding
clusters, and $n^{(i)}_{cl}$ their number, one defines:

$Q_{a \to i}(u) = \frac{1}{n^{(i)}_{cl}} \sum_{\alpha=1}^{n^{(i)}_{cl}} \delta_{u, u_{a \to i}(S^{(i)}_\alpha)}$ .   (14)



E. Survey propagation equations 

Based on these definitions, one can easily guess the generalized recurrence equations for the (approximate)
probabilities $Q_{a \to i}(u)$ that implement the solutions in this enlarged configuration space. These SP equations lead to a
small, yet fundamental, modification of the BP equations. The basic assumption is again that incoming warnings
are independent. But contradictory messages have to be explicitly forbidden. Having Fig. 3 in mind, and using the
same notations as in Sec. III C, we use the incoming surveys (i.e. the set of surveys $\{Q_{b \to j}\}$ with $j \in I(a) \setminus i$ and
$b \in A(j) \setminus a$) to calculate the cavity-field distributions as in Eq. (7):

$H_{j \to a}(h) = \sum_{\{u_{b \to j}\}_{b \in D_j}} \delta_{h, h_{j \to a}} \prod_{b \in A(j) \setminus a} Q_{b \to j}(u_{b \to j})$ .   (15)

Remember that these fields lead to contradictions if and only if $h_{j \to a} = (1, 1, \dots, 1)$ for at least one $j$. We
therefore introduce the set of all non-contradictory cavity-field configurations,

$\mathcal{M}_{a \to i} = \{ \{h_{j \to a}\}_{j \in I(a) \setminus i} \mid \forall j:\ h_{j \to a} \in \{0, 1\}^q,\ h_{j \to a} \neq (1, \dots, 1) \}$ .   (16)

Then, for an element of $\mathcal{M}_{a \to i}$ we define again

$T(\{h_{j \to a}\}) = \{ (s_j)_{j \in I(a) \setminus i} \mid h^{s_j}_{j \to a} = 0,\ \forall j \in I(a) \setminus i \}$ ,   (17)

the set of allowed configurations for the variable nodes above function node $a$. Now the difference to BP enters: all
elements of $T(\{h_{j \to a}\})$ naturally belong to the same cluster, i.e. they give rise to a single outgoing warning. The
new warning is thus computed on the set of allowed configurations; it is given by $u_{a \to i}(T(\{h_{j \to a}\}))$. Its distribution
follows immediately,

$Q_{a \to i}(u) = Z^{-1} \sum_{\{h_{j \to a}\} \in \mathcal{M}_{a \to i}} \delta_{u, u_{a \to i}(T(\{h_{j \to a}\}))} \prod_{j \in I(a) \setminus i} H_{j \to a}(h_{j \to a})$ .   (18)

The equations (15-18) are the SP equations. Note that Eq. (18) produces a dramatic change in the iteration of the
probabilities with respect to the BP Eq. (8): every allowed cavity-field configuration contributes only one term to
the sum. Note also that contradictory messages have to be excluded explicitly by summing only over $\mathcal{M}_{a \to i}$. In
belief propagation, for each configuration of input messages one takes the full collection of possible outputs, thereby
introducing a bifurcation mechanism (which may easily become unstable). On the contrary, in SP the presence of
multiple outputs is collapsed into the null message (which may not even be present in the belief propagation formalism,
as happens for the coloring problem). A variable which receives a message having at least two zero components
will be "unfrozen" in the corresponding cluster.

The SP equations (15-18) provide a closed set of equations for the surveys. Practically, this recurrence defines a
map

$\Lambda : \{Q^{old}_{a \to i}\} \to \{Q^{new}_{a \to i}\}$   (19)

and we are looking for a fixed point of this map, which will be obtained numerically by starting with some (random)
initial $\{Q^0_{a \to i}\}$ and applying $\Lambda$ iteratively:

$\{Q^*_{a \to i}\} = \lim_{m \to \infty} \underbrace{\Lambda \circ \cdots \circ \Lambda}_{m\ \mathrm{times}} \{Q^0_{a \to i}\}$   (20)

Such a fixed point will be called a "self-consistent" set of surveys.
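
The iteration of Eq. (20) can be organized as a generic loop over edges. The sketch below is an assumption-laden outline, not the authors' implementation: the function name `iterate_to_fixed_point`, the sequential update in random edge order, the tolerance `tol` and the cap `max_sweeps` are all choices made for illustration, and the `toy_update` map stands in for the actual SP update only to exercise the loop.

```python
import random

def iterate_to_fixed_point(messages, update, tol=1e-9, max_sweeps=1000, rng=None):
    """Apply update(edge, messages) edge by edge, in random order, until the
    largest change of any message component in a sweep drops below tol.

    messages maps each edge to a tuple of floats. Returns (messages, sweeps)
    on convergence and (None, max_sweeps) otherwise.
    """
    rng = rng or random.Random(0)
    edges = list(messages)
    for sweep in range(max_sweeps):
        rng.shuffle(edges)
        largest_change = 0.0
        for edge in edges:
            new = update(edge, messages)
            change = max(abs(x - y) for x, y in zip(new, messages[edge]))
            largest_change = max(largest_change, change)
            messages[edge] = new
        if largest_change < tol:
            return messages, sweep + 1
    return None, max_sweeps

def toy_update(edge, messages):
    """Stand-in contraction map with fixed point 0.5 on both edges."""
    other = [e for e in messages if e != edge][0]
    return tuple(0.5 * v + 0.25 for v in messages[other])

start = {("a", "i"): (0.0,), ("b", "j"): (1.0,)}
fixed_point, sweeps = iterate_to_fixed_point(start, toy_update)
```

Returning `None` on non-convergence lets a caller distinguish the failure case explicitly instead of using unconverged messages.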



F. An example of Survey Propagation: 3-COL 

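For the 3-COL example, because of the additional null message, the warning distribution now reads

$Q_{a \to i}(u) = \eta^0_{a \to i}\, \delta_{u,(0,0,0)} + \eta^1_{a \to i}\, \delta_{u,(1,0,0)} + \eta^2_{a \to i}\, \delta_{u,(0,1,0)} + \eta^3_{a \to i}\, \delta_{u,(0,0,1)}$ ,   (21)

and the SP equations corresponding to Fig. 4 are given by

$\eta^p_{a \to i} = \frac{\prod_{l=1}^{3} (1 - \eta^p_{b_l \to j}) - \sum_{r \neq p} \prod_{l=1}^{3} (\eta^0_{b_l \to j} + \eta^r_{b_l \to j}) + \prod_{l=1}^{3} \eta^0_{b_l \to j}}{\sum_{r=1}^{3} \prod_{l=1}^{3} (1 - \eta^r_{b_l \to j}) - \sum_{r=1}^{3} \prod_{l=1}^{3} (\eta^0_{b_l \to j} + \eta^r_{b_l \to j}) + \prod_{l=1}^{3} \eta^0_{b_l \to j}}$   (22)

for $p \in \{1, 2, 3\}$. Then $\eta^0_{a \to i}$ can be computed by normalization, i.e.

$\eta^0_{a \to i} = 1 - (\eta^1_{a \to i} + \eta^2_{a \to i} + \eta^3_{a \to i})$   (23)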

The interpretation of this equation is again straightforward; for simplicity we explain it just for color 1. Now $\eta^1_{a \to i}$
is given by the probability that $s_j$ is forced to take value 1, i.e. by the probability that the cavity field equals
$h_{j \to a} = (0, 1, 1)$, conditioned to non-contradictory fields. The numerator calculates the unconditioned probability:
the first term excludes fields $h_{j \to a}$ which would forbid color 1 to $s_j$. The second term takes out fields $(0, h^2, 0)$ and $(0, 0, h^3)$
where $s_j$ would be allowed to also take at least one other color. The last term adds again the probability of field (0, 0, 0)
which was subtracted twice in the second term. The denominator realizes the conditioning to non-contradictory fields,
i.e. it gives the probability that $h_{j \to a} \neq (1, 1, 1)$. The counting of possible cases follows again the inclusion-exclusion
principle: in the first term, we count fields that have a zero component in color $r$, summed over $r$. We have to subtract
the double countings due to fields having two zero components, and we have to add once the field (0, 0, 0) which was
added three times in the first term, and subtracted three times in the second one.

Note that the symmetry between colors leads immediately to a solution $\eta^0_{a \to i} = 1$ for all edges $(i, a)$ of the factor
graph, i.e. only null messages are sent. This is, however, not the correct solution in the clustered regime, since the color
symmetry is not valid at the level of solution clusters. In fact, the appearance of a non-trivial solution for the $\eta^p_{a \to i}$
marks the onset of clustering.
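
The inclusion-exclusion structure of Eqs. (22) and (23) can be verified numerically against a direct enumeration of the SP rule of Eqs. (15) to (18) for coloring. The sketch below uses hypothetical function names; each incoming survey is a tuple $(\eta^0, \eta^1, \eta^2, \eta^3)$, and in the enumeration the message index 0 stands for the null message while index r stands for "color r forbidden".

```python
from itertools import product
from math import prod

def sp_update_3col(etas):
    """Eqs. (22)-(23): outgoing survey (eta0, eta1, eta2, eta3)."""
    colors = (1, 2, 3)
    not_forbidden = {r: prod(1 - e[r] for e in etas) for r in colors}  # P(h^r = 0)
    only_r_seen = {r: prod(e[0] + e[r] for e in etas) for r in colors}  # P(both other components are 0)
    all_null = prod(e[0] for e in etas)                                # P(h = (0,0,0))
    denom = sum(not_forbidden.values()) - sum(only_r_seen.values()) + all_null
    out = [0.0] * 4
    for p in colors:
        numer = not_forbidden[p] - sum(only_r_seen[r] for r in colors if r != p) + all_null
        out[p] = numer / denom
    out[0] = 1.0 - sum(out[1:])
    return out

def sp_update_bruteforce(etas):
    """Enumerate incoming messages, drop contradictory fields, and collapse
    multiple allowed colors into the null message."""
    out = [0.0] * 4
    for msgs in product(range(4), repeat=len(etas)):
        weight = prod(e[m] for e, m in zip(etas, msgs))
        allowed = [c for c in (1, 2, 3) if c not in msgs]
        if not allowed:
            continue                      # field (1,1,1): contradiction, excluded
        out[allowed[0] if len(allowed) == 1 else 0] += weight
    z = sum(out)
    return [w / z for w in out]

etas = [(0.4, 0.3, 0.2, 0.1), (0.1, 0.2, 0.3, 0.4), (0.25, 0.25, 0.25, 0.25)]
closed_form = sp_update_3col(etas)
enumerated = sp_update_bruteforce(etas)
```

Feeding only null messages, i.e. surveys (1, 0, 0, 0), returns the null survey again, in line with the trivial symmetric solution discussed above.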



G. The K-SAT case 



In the SAT case, $q = 2$, so the possible $u$ messages are (0,0), (1,0), (0,1) and (1,1). As any clause can be satisfied by
any given variable (by choosing its value according to the negation of its corresponding literal), the (1,1) message
will never show up. Moreover, for a given $a \to i$, which of (1,0) or (0,1) can appear in $u_{a \to i}$ is completely
determined by the sign of the corresponding literal. So we can parameterize the distributions $Q_{a \to i}$ with only one real
number $\eta_{a \to i}$, the probability of the nontrivial $u_{a \to i}$ message ((1,0) or (0,1)). The probability of (0,0) is then
simply $1 - \eta_{a \to i}$. The corresponding equations for the case of 3-SAT have been written and implemented in [2, 12].
They are expressed in terms of two sets, $A^u_a(j)$ and $A^s_a(j)$, into which $A(j)$ is decomposed ($A(j) = A^u_a(j) \cup A^s_a(j)$), where
the indices $s$ (resp. $u$) refer to the neighbors $b$ for which the literals $(b, j)$ and $(a, j)$ agree (resp. disagree). This
separation corresponds to the distinction of which neighbors contribute to make variable $j$ satisfy or not satisfy the clause $a$.

For example, the product $\prod_{b \in A^s_a(j)} (1 - \eta_{b \to j})$ gives the probability that no nontrivial message arrives on $j$ from
the function nodes $b \in A^s_a(j)$ (empty products are set to 1 by definition).
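
The update itself is not reproduced in the text above, so the following sketch relies on the form commonly quoted in the SP literature for K-SAT, stated here as an assumption: $\eta_{a \to i} = \prod_{j \in I(a) \setminus i} \Pi^u_{j \to a} / (\Pi^u_{j \to a} + \Pi^s_{j \to a} + \Pi^0_{j \to a})$, with $\Pi^u = [1 - \prod_{b \in A^u_a(j)} (1 - \eta_{b \to j})] \prod_{b \in A^s_a(j)} (1 - \eta_{b \to j})$, $\Pi^s$ obtained by exchanging the roles of $u$ and $s$, and $\Pi^0 = \prod_b (1 - \eta_{b \to j})$. The data layout (`clauses` as a dictionary of literal signs, `eta` keyed by `(clause, variable)`) and the function name are hypothetical, and clause `a` itself is excluded from both products.

```python
def sp_ksat_update(a, i, clauses, eta):
    """Assumed standard SP update for the message eta_{a->i}.

    clauses[a] maps each variable of clause a to the sign of its literal;
    eta[(b, j)] is the current survey on edge b -> j.
    """
    result = 1.0
    for j, sign_in_a in clauses[a].items():
        if j == i:
            continue
        prod_s = prod_u = 1.0
        for b, literals in clauses.items():
            if b == a or j not in literals:
                continue
            if literals[j] == sign_in_a:
                prod_s *= 1.0 - eta[(b, j)]   # b agrees with a on j: b in A^s_a(j)
            else:
                prod_u *= 1.0 - eta[(b, j)]   # b disagrees with a on j: b in A^u_a(j)
        pi_u = (1.0 - prod_u) * prod_s        # j is pushed to violate clause a
        pi_s = (1.0 - prod_s) * prod_u        # j is pushed to satisfy clause a
        pi_0 = prod_s * prod_u                # j receives no nontrivial warning
        total = pi_u + pi_s + pi_0
        if total == 0.0:
            raise ValueError("contradictory warnings on variable %r" % (j,))
        result *= pi_u / total
    return result

clauses = {
    "a": {0: True, 1: True, 2: True},
    "b": {1: False, 3: True, 4: True},
    "c": {2: False, 5: True, 6: True},
}
eta = {("b", 1): 1.0, ("c", 2): 0.5}
eta_a0 = sp_ksat_update("a", 0, clauses, eta)
```

In this toy setting variable 1 is certainly forced against clause `a` and variable 2 is forced against it half of the time, so the warning sent to variable 0 carries the product of the two probabilities.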



H. The q-COL case 



We have already discussed in detail the 3-COL problem, a general number q of colors can be handled analogously. 
Messages u tt ^ t are elements of {0, l} 9 forbidding the colors which have 1 in their corresponding coordinate. We can 
immediately see that the possible types of u a ->i message are (0, . . . , 0, 1, 0, . . . 0) (a 1 on the color taken by the variable 
in the other end of the link), plus the additional null message (0, . . . , 0) (if the neighboring variable is in the joker 
state). So we can parameterize Q a ^i by only q real numbers. 

Looking at Figure 5, suppose variable i has color p forbidden (i.e. the message $u_{a\to i}$ has a 1 on component p). This 
implies that in this configuration variable j is forced to take color p, that is, $h_{j\to a}$ is of the form $(1,\dots,1,0,1,\dots,1)$, 
with a single 0 in the p-th position ("freezing" type). For all other $h_{j\to a}$ the variable j is in the joker state and the 
output message will be $(0,\dots,0)$. 
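
The rule just described, mapping a cavity field $h_{j\to a}$ to the outgoing message $u_{a\to i}$, can be written as a short Python function. This is only a sketch: the function name and the tuple encoding (entry 1 means the color is forbidden) are assumptions for illustration. An all-ones field is treated as a contradiction and rejected.

```python
def coloring_warning(h):
    """Map a cavity field h (1 = color forbidden) to the outgoing message u.

    Hypothetical helper: a field with exactly one allowed color p freezes the
    variable to p, so the neighbor is forbidden to take p; any other
    non-contradictory field leaves the variable in the joker state.
    """
    allowed = [p for p, forbidden in enumerate(h) if not forbidden]
    if not allowed:
        raise ValueError("contradictory field: every color is forbidden")
    u = [0] * len(h)
    if len(allowed) == 1:
        u[allowed[0]] = 1  # forbid the frozen color on the other end of the link
    return tuple(u)
```

For q = 3, the field (1, 0, 1) yields the message (0, 1, 0), whereas (1, 0, 0) yields the null message (0, 0, 0).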



I. Decimation 



Figure 5: Iteration for q-COL 

Once the iteration of the SP equations has converged (we stop when $\max_{a \in A,\, i \in I(a)} |{}^{\mathrm{old}}Q_{a\to i} - {}^{\mathrm{new}}Q_{a\to i}|$ becomes small 
enough), we can use the information computed so far to find a solution to the original problem [2, 3]. We can easily 
compute (approximately) the local-field distributions $\{H_i\}_{i \in I}$ introduced in Eq. (13) by considering all neighboring 
function nodes and forbidding contradictory messages (remember that in the cavity graph we have deleted the 
constraints containing variable i, whereas in $H_i$ we have to restrict the sum to messages that are "extensible" to solutions 
of the complete problem): 

$$H_i(h) = Z_i^{-1} \sum_{\{u_{a\to i}\}_{a \in A(i)}} \left(1 - \delta_{h,(1,\dots,1)}\right) \delta_{h,h_i} \prod_{a \in A(i)} Q_{a\to i}(u_{a\to i}) , \qquad (26)$$

with $h_i$ determined according to Eq. (12). 
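
A brute-force evaluation of the local-field distribution may help to fix ideas. The Python sketch below enumerates all combinations of incoming messages, discards the contradictory field $(1,\dots,1)$ and normalizes. It assumes, for the coloring case, that the field $h_i$ is the componentwise maximum (logical OR) of the incoming messages; this rule and all names are assumptions of the example, and the enumeration is exponential in the number of neighbors, so it is only meant for small degrees.

```python
from itertools import product


def local_field_distribution(surveys):
    """Return H_i as a dict {field tuple h: probability}.

    surveys: non-empty list with one dict per neighboring function node,
    each mapping a message tuple u (1 = forbidden state) to Q(u).
    Assumption of this sketch: h is the componentwise OR of the messages.
    """
    q = len(next(iter(surveys[0])))
    weights = {}
    for combo in product(*(s.items() for s in surveys)):
        h = [0] * q
        weight = 1.0
        for u, prob in combo:
            weight *= prob
            h = [max(x, y) for x, y in zip(h, u)]
        h = tuple(h)
        if all(h):
            continue  # contradictory field: every state forbidden
        weights[h] = weights.get(h, 0.0) + weight
    z = sum(weights.values())
    if z == 0.0:
        raise ValueError("all message combinations are contradictory")
    return {h: w / z for h, w in weights.items()}
```

With q = 2 and two neighbors that each send a warning with probability 1/2, one on each color, the contradictory combination (1, 1) carries weight 1/4 and is removed, so the three remaining fields each get probability 1/3.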

The value $H_i((1,\dots,1,0,1,\dots,1))$, with a single zero entry at component p, now gives the probability that variable 
i is frozen to the value p. A simple decimation procedure can be implemented: select the most frozen 
variable and fix it to its most frozen value, then simplify the problem. Certain constraints may already be satisfied 
independently of the values of the other participating variables, and can be deleted from the problem instance. Other 
constraints may now immediately fix single variables to one state (unit-clause resolution). Then reconverge the warning 
distributions on the smaller subproblem. 
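
The selection step of this procedure can be sketched as follows. The function scans the local-field distributions for fields with exactly one allowed state and returns the variable, value and probability with the largest freezing probability. The dictionary layout and the name `most_frozen` are assumptions of the example; simplification and unit-clause resolution are not shown.

```python
def most_frozen(fields):
    """Pick the most frozen variable (hypothetical helper, illustration only).

    fields: dict mapping each variable to a dict {field tuple h: probability},
    where an entry 1 in h means that state is forbidden.
    Returns (variable, frozen_value, probability), or None if no field
    freezes any variable.
    """
    best = None
    for var, dist in fields.items():
        for h, prob in dist.items():
            if sum(1 for x in h if x == 0) == 1:  # exactly one allowed state
                value = h.index(0)
                if best is None or prob > best[2]:
                    best = (var, value, prob)
    return best
```

For instance, if variable "x" is frozen to state 1 with probability 0.7 and variable "y" to state 0 with probability 0.9, the function returns ("y", 0, 0.9).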

The decimation algorithm can have three types of behaviors: 

1. The algorithm is able to solve the problem by fixing all, or almost all, variables (some variables may still be unfixed 
even if the problem is already solved). 

2. The surveys converge at some stage to the trivial solution concentrated on null messages, $Q_{a\to i}(u) = \delta_{u,(0,\dots,0)}$ 
for all $(i, a) \in E$. In this case SP has nothing more to offer. Luckily, these problems are generally 
underconstrained and thus easy to solve by other means. Note that, for q-COL, the trivial solution always exists. In 
numerical experiments, we found that whenever another solution existed, the latter was the correct one. 
It is therefore reasonable to restart the iteration of the SP equations from a new random 
initial condition, even if a trivial solution was found once. Only if no non-trivial solution can be found after 
several restarts is the subproblem passed to a different solver. 

3. The SP algorithm does not converge at some stage, even if the initial problem was satisfiable. 
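
The distinction between these outcomes at a single stage can be sketched with a generic fixed-point loop that applies the stopping rule of the text (maximum change of any survey below a tolerance). The sketch assumes the scalar parameterization $\eta_{a\to i}$ of the 3-SAT case; the update rule is passed in as a callable, and the names, the tolerance and the sweep limit are assumptions of the example.

```python
def iterate_surveys(update, eta, tol=1e-3, max_sweeps=1000):
    """Sweep the surveys until they stop changing (illustrative sketch).

    update: callable (key, eta) -> new value of eta[key].
    eta: dict of scalar surveys, modified in place.
    Returns "converged", "trivial" (all surveys numerically zero, i.e.
    concentrated on null messages) or "unconverged".
    """
    for _ in range(max_sweeps):
        max_change = 0.0
        for key in eta:
            new = update(key, eta)
            max_change = max(max_change, abs(new - eta[key]))
            eta[key] = new
        if max_change < tol:
            if all(value < tol for value in eta.values()):
                return "trivial"
            return "converged"
    return "unconverged"
```

A driver would decimate after "converged", restart from a new random initial condition (and eventually hand the subproblem to another solver) after "trivial", and give up or fall back after "unconverged".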

On large random instances of 3-SAT or q-COL in the hard SAT region, but not too close to the 
satisfiability threshold, numerical experiments show that the algorithm behaves as in case 2). 

The generated subproblems turn out to be very simple to solve by other conventional heuristics, e.g. WalkSAT 
or unmodified belief propagation. 

Case 3) generally happens very close to the SAT/UNSAT transition. It is not yet clear whether this problem is 
due to the existence of finite loops in the original problem (which make the SP equations only approximate), 
to the simple decimation heuristic, which always fixes the most frozen variable, or to some problems which 
go beyond the validity of the SP equations themselves. 

V. WHAT'S NEXT 



We would like to point out two possible directions of research, among all those that may follow from the presented 
algorithm. One is to formalize rigorously the notions suggested in Section IV, allowing for a well-defined notion 
of the clusters and a corresponding derivation of the SP equations. 

Another one, of great computational relevance, is to generalize SP, which was presented here in its purest form, to deal 
with correlations between warnings that arise from local problem structures such as small loops in the factor graph, cf. 
similar generalizations of BP. A second possible generalization would include more diverse structures of the space of 
solutions, e.g. clusters of solution clusters, and so on. In the language of statistical physics, this would include 
"more than one step of replica-symmetry breaking". 

After completing this work, we learned from G. Parisi that he has reached a similar conclusion on the interpretation 
of SP in the colouring problem through the addition of an extra state for the variables. 